Abstract

Tradeoffs between the information rate and fidelity of quantum error-correcting codes are discussed.
Quantum channels to be considered are those subject to independent errors and modeled as tensor
products of copies of a general completely positive linear map, where the dimension of the underlying
Hilbert space is a prime number. On such a quantum channel, the highest fidelity of a quantum
error-correcting code of length $n$ and rate $R$ is proven to be lower bounded by
$1 - \exp[-nE(R) + o(n)]$ for some function $E(R)$. The $E(R)$ is positive below some threshold $R_0$,
a direct consequence of which is that $R_0$ is a lower bound on the quantum capacity. This is an
extension of the author's previous result [M. Hamada, Phys. Rev. A, vol. 65, 052305, 2002; LANL
e-Print, quant-ph/0109114, 2001]. While it states the result for the depolarizing channel and a slight
generalization of it (Pauli channels), the result of this work applies to general discrete memoryless
channels, including channel models derived from a physical law of time evolution.
I. Introduction

Quantum error-correcting codes (also called quantum codes or codes in this work) have attracted
much attention as schemes that protect quantum states from decoherence during quantum computation.
Shor invented the first code and stated that the ultimate goal would be to define the quantum
analog of Shannon's channel capacity, and find encoding schemes which approach this capacity [?].
On quantum memoryless channels, several bounds on the quantum capacity are known [?].
Good surveys on this problem are given in the introductory section of [?] and in [?]. There is a
conjecture that the known upper bound based on the notion called coherent information is tight [?],
[5, Section VI]. On the other hand, the existing lower bounds seem to have left much room for
improvement. For example, there is a lower bound on the capacity of the so-called depolarizing channel
which can be proved by a random coding argument that evaluates the average performance over the
whole ensemble of standard quantum error-correcting codes [?], or by an argument using an
entanglement purification protocol [?]. Shor and Smolin [10] argued that this bound is not tight,
showing the existence of quantum codes, which are, in a sense, analogous to classical concatenated
codes [11], of performance beyond it for a limited class of very noisy channels. The present author
recently strengthened the result on standard quantum error-correcting codes [?] in another direction,
namely, established exponential convergence of fidelity of codes used on slight generalizations of
the depolarizing channel [12]. In other words, using these simple channels, he illustrated that certain
results and ideas around the error exponent problem in classical information theory, which has been a
central issue [?], [15], [?], [17], [?], [?], can be extended to quantum channels. The classical
error exponent problem is, roughly speaking, to determine the function $E_{\rm cl}(R, W)$ such that the
decoding error probability $P^*$ of the best code of length $n$ and rate $R$ behaves like
$P^* \approx \exp[-nE_{\rm cl}(R, W)]$ on a channel $W$. The $E_{\rm cl}(R, W)$, which is called the
reliability function or the highest achievable error exponent of a channel $W$, is positive below the
capacity of $W$, and decreasing in $R$. See, e.g., [14], [15] for precise definitions of the reliability
function, [?] for a recent development, and [19], [?] for history. There is no reason to employ codes
of rates near the capacity exclusively because the smaller $R$ is, the greater $E_{\rm cl}(R, W)$ is, and
hence the exponentially smaller $P^* \approx \exp[-nE_{\rm cl}(R, W)]$ is.

The goal of this work is to show such exponential convergence of the fidelity of quantum
error-correcting codes on a much wider class of channels. The channels to be considered here are those
subject to independent errors and modeled as tensor products of copies of a general completely positive
(CP) linear map [?]. Our channel class includes those derived from a physical law of time evolution,
or from master (Lindblad) equations [?], [23], though it is stipulated that the Hilbert spaces
underlying channels have dimensions of prime numbers. One example of such channels is the
amplitude-damping channel, which has often been discussed in the context of quantum error correction
[?], [24], [26]. Despite the fact that this channel has often been treated as a model of quantum noise
suffered during quantum computation, it has not been known whether standard quantum error-correcting
codes work reliably at a positive rate for all large enough code lengths on this channel.

This work was inspired by Matsumoto and Uyematsu [27], who tried to prove a lower bound on the
quantum capacity of a general memoryless channel using standard quantum error-correcting codes.
However, their proof turned out to be wrong [R. Matsumoto and T. Uyematsu, 24th
Symposium on Information Theory and Its Applications, Kobe, Hyogo, Japan, Dec. 7, 2001]. In fact,
in [27] they used an inequality similar to that in Lemma 5 below, which allegedly held for the standard
fidelity measure (minimum fidelity, denoted by $F(\mathcal{C})$ in this paper), but this fails as shown
in Example 3 below. Moreover, their bound [27] is smaller than Preskill's lower bound [?, Section 7.16.2]
for the so-called Pauli channels in general. It may be said that their contribution lies in the use,
in the present context, of the estimate due to Calderbank et al., which will be given as a lemma below
in a slightly different form. This is what this work has inherited from [27]. Thus, the question of
whether quantum error-correcting codes work reliably on general channels is yet to be answered, and
this paper addresses it from an information-theoretic viewpoint. Specifically, exponential
convergence of the fidelity of codes on general memoryless channels is established. The proof to be
presented below exploits existing information-theoretic techniques, such as the method of types [15],
[28], [29], [30], as well as a previously unused property of standard quantum error-correcting codes.



We remark that in the setting where classical messages are sent over quantum channels, the error
exponent problem has been discussed by Burnashev and Holevo [31] and Holevo, while this paper is
concerned with the problem of preserving or transmitting quantum states in the presence of quantum
noise. Note also that the error exponents of quantum error-detecting codes, which do not correct errors
but only detect errors, have been discussed by Ashikhmin et al. [33].



The rest of the paper is organized as follows. Section II presents the main result. In Section III, a
performance measure for codes, which is called the minimum average fidelity, is introduced and it is
argued that evaluating this measure gives a good estimate for the standard fidelity. Section IV reviews
the standard quantum codes, and Section V gives bounds on the minimum average fidelity of codes.
Finally, the main result is proved in Section VI, which is followed by a concluding section. Appendices
are given to prove a proposition, two lemmas, and an inequality between the proposed bound and the
previously known one.

II. Main Result 

As usual, all possible quantum operations and state changes, including quantum channels, are
described in terms of completely positive (CP) linear maps [21], [22], [?]. In this work, only
trace-preserving completely positive (TPCP) linear maps are treated. Given a Hilbert space $\mathsf{H}$ of
finite dimension, let $\mathsf{L}(\mathsf{H})$ denote the set of linear operators on $\mathsf{H}$. In
general, every CP linear map $\mathcal{M} : \mathsf{L}(\mathsf{H}) \to \mathsf{L}(\mathsf{H})$ has an
operator-sum representation $\mathcal{M}(\rho) = \sum_{i \in \mathcal{I}} M_i \rho M_i^\dagger$ with some
set of operators $\{M_i \in \mathsf{L}(\mathsf{H})\}_{i \in \mathcal{I}}$, which is not unique [22], [?].
When $\mathcal{M}$ is specified by a set of operators $\{M_i\}_{i \in \mathcal{I}}$ in this way, we write
$\mathcal{M} \sim \{M_i\}_{i \in \mathcal{I}}$. Note that we can always have $|\mathcal{I}|$ equal to
$(\dim \mathsf{H})^2$, including null operators in $\{M_i\}_{i \in \mathcal{I}}$ if necessary [22].



Hereafter, $\mathsf{H}$ denotes an arbitrarily fixed Hilbert space of dimension $d$, which is a prime
number. A quantum channel is a sequence of TPCP linear maps
$\{\mathcal{A}_n : \mathsf{L}(\mathsf{H}^{\otimes n}) \to \mathsf{L}(\mathsf{H}^{\otimes n})\}$; the map
$\mathcal{A}_n$ with a fixed $n$ is also called a channel. We want a large subspace
$\mathcal{C} \subseteq \mathsf{H}^{\otimes n}$, every state vector in which remains almost unchanged after
the effect of a channel followed by the action of some suitable recovery process. The recovery process
is again described as a TPCP linear map
$\mathcal{R} : \mathsf{L}(\mathsf{H}^{\otimes n}) \to \mathsf{L}(\mathsf{H}^{\otimes n})$. A pair
$(\mathcal{C}, \mathcal{R})$ consisting of such a subspace $\mathcal{C}$ and a TPCP linear map
$\mathcal{R}$ is called a code and its performance is evaluated in terms of the minimum fidelity [?],

$$F(\mathcal{C}, \mathcal{R}\mathcal{A}_n) = \min_{|\psi\rangle \in \mathcal{C}} \langle\psi| \mathcal{R}\mathcal{A}_n(|\psi\rangle\langle\psi|) |\psi\rangle, \tag{1}$$

where $\mathcal{R}\mathcal{A}_n$ denotes the composition of $\mathcal{A}_n$ and $\mathcal{R}$. Throughout,
bras $\langle\cdot|$ and kets $|\cdot\rangle$ are assumed normalized. Sometimes, a subspace $\mathcal{C}$
alone is called a code assuming implicitly some recovery operator. Let $F^*_{n,k}(\mathcal{A}_n)$ denote
the supremum of $F(\mathcal{C}, \mathcal{R}\mathcal{A}_n)$ over all codes $(\mathcal{C}, \mathcal{R})$ with
$\log_d \dim \mathcal{C} \ge k$, where $n$ is a positive integer and $k$ is a nonnegative real number.
This paper gives an exponential lower bound on $F^*_{n,k}(\mathcal{A}_n)$, where for simplicity we state
the result in the case where the channel is memoryless, i.e., when
$\mathcal{A}_n = \mathcal{A}^{\otimes n}$ for some
$\mathcal{A} : \mathsf{L}(\mathsf{H}) \to \mathsf{L}(\mathsf{H})$; the channel
$\{\mathcal{A}_n = \mathcal{A}^{\otimes n}\}$ is referred to as the memoryless channel $\mathcal{A}$.

The codes to be proven to have the desired performance are symplectic (stabilizer or additive)
codes [35], [36], [37], [38], [39]. In designing these codes, the following basis of
$\mathsf{L}(\mathsf{H})$, which has some nice algebraic properties, is used. Fix an orthonormal basis
(ONB) $\{|0\rangle, \dots, |d-1\rangle\}$ of $\mathsf{H}$. The 'error basis' is
$\mathsf{N} = \{N_{(i,j)} = X^i Z^j\}_{(i,j) \in \mathcal{X}}$, where $\mathcal{X} = \{0, \dots, d-1\}^2$
and the unitary operators $X, Z \in \mathsf{L}(\mathsf{H})$ are defined by

$$X|j\rangle = |(j-1) \bmod d\rangle, \qquad Z|j\rangle = \omega^j |j\rangle \tag{2}$$

with $\omega$ being a primitive $d$-th root of unity [?, Section IV-15]. When $d = 2$, the basis elements
become $I, X, XZ, Z$, which are the same as the identity and three Pauli operators up to a phase factor.
As usual, the classical Kullback-Leibler information (informational divergence or relative entropy) is
denoted by $D$ and entropy by $H$ [15], [28], [30]. Specifically, for probability distributions $P$ and
$Q$ on a finite set $\mathcal{X}$, we define $D(P\|Q)$ by
$D(P\|Q) = \sum_{x \in \mathcal{X}} P(x) \log_d [P(x)/Q(x)]$ and $H(Q)$ by
$H(Q) = -\sum_{x \in \mathcal{X}} Q(x) \log_d Q(x)$. By convention, we assume $\log(a/0) = \infty$ for
$a > 0$ and $0 \log 0 = 0 \log(0/0) = 0$.

To state our result, we associate a probability distribution with a channel. 

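Definition 1: For a memoryless channel $\mathcal{A} : \mathsf{L}(\mathsf{H}) \to \mathsf{L}(\mathsf{H})$,
we define a probability distribution $P_{\mathcal{A}} = P_{\mathcal{A},\mathsf{N}}$ on $\mathcal{X}$ as
follows. For an operator-sum representation $\mathcal{A} \sim \{A_u\}_{u \in \mathcal{X}}$, expand $A_u$
in terms of the error basis $\mathsf{N}$ as $A_u = \sum_{v \in \mathcal{X}} a_{uv} N_v$,
$u \in \mathcal{X}$. Then,

$$P_{\mathcal{A}}(v) = P_{\mathcal{A},\mathsf{N}}(v) = \sum_{u \in \mathcal{X}} |a_{uv}|^2, \qquad v \in \mathcal{X}.$$

◇

Remarks: With $\mathcal{A}$ and $\mathsf{N}$ fixed, $P_{\mathcal{A}}$ does not depend on the choice of
$\{A_u\}_{u \in \mathcal{X}}$, while it depends on $\mathsf{N}$ as well as $\mathcal{A}$. That
$\sum_{v \in \mathcal{X}} P_{\mathcal{A}}(v) = 1$ readily follows from the trace-preserving condition
$\sum_{u \in \mathcal{X}} A_u^\dagger A_u = I$ and the property of the basis $\mathsf{N}$ that
$N_u^\dagger N_v = I$ if and only if $u = v$ [37]. □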
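As a hedged illustration of Definition 1, the following minimal sketch computes $P_{\mathcal{A}}$ from a list of Kraus operators. It assumes the expansion coefficients are obtained as $a_{uv} = \mathrm{Tr}(N_v^\dagger A_u)/d$ (trace orthogonality of the error basis) and fixes $\omega = e^{2\pi i/d}$ as one possible primitive root; the function names and the sample value of `gamma` are hypothetical and chosen only for this example.

```python
import cmath
import math

def error_basis(d):
    """Return {(i, j): X^i Z^j} as d x d nested lists, with X|j> = |(j-1) mod d>."""
    omega = cmath.exp(2j * cmath.pi / d)  # one choice of primitive d-th root of unity
    basis = {}
    for i in range(d):
        for j in range(d):
            # X^i Z^j |c> = omega^(j*c) |(c - i) mod d>
            mat = [[0j] * d for _ in range(d)]
            for c in range(d):
                mat[(c - i) % d][c] = omega ** (j * c)
            basis[(i, j)] = mat
    return basis

def expansion_coefficient(a, n_v, d):
    """Coefficient of N_v in A, computed as Tr(N_v^dagger A) / d."""
    return sum(n_v[r][c].conjugate() * a[r][c]
               for r in range(d) for c in range(d)) / d

def channel_distribution(kraus_ops, d):
    """P_A(v) = sum over Kraus operators A_u of |a_uv|^2."""
    basis = error_basis(d)
    return {v: sum(abs(expansion_coefficient(a, n_v, d)) ** 2 for a in kraus_ops)
            for v, n_v in basis.items()}

# Sample input: a qubit amplitude-damping channel with an arbitrary damping parameter.
gamma = 0.3
amplitude_damping = [
    [[1, 0], [0, math.sqrt(1 - gamma)]],
    [[0, math.sqrt(gamma)], [0, 0]],
]
P = channel_distribution(amplitude_damping, 2)
```

The resulting dictionary is indexed by $(i, j) \in \mathcal{X}$ and sums to one for any trace-preserving input, which gives a quick sanity check on a set of Kraus operators.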

This paper's main result is the following one. 

Theorem 1: Let integers $n$, $k$ and a real number $R$ satisfy $0 \le k \le \lceil Rn \rceil$ and
$0 \le R \le 1$ (a typical choice is $k = \lceil Rn \rceil$ for an arbitrarily fixed rate $R$). Then, for
any memoryless channel $\mathcal{A} : \mathsf{L}(\mathsf{H}) \to \mathsf{L}(\mathsf{H})$, and for any
choice of the basis $\{|0\rangle, \dots, |d-1\rangle\}$ and $\omega$ which determine $\mathsf{N}$, we have

$$F^*_{n,k}(\mathcal{A}^{\otimes n}) \ge 1 - 2d^2 (n+1)^{2(d^2-1)} d^{-nE(R, P_{\mathcal{A},\mathsf{N}})}$$

where

$$E(R, P) = \min_{Q} \left[ D(Q\|P) + |1 - H(Q) - R|^+ \right],$$

$|x|^+ = \max\{x, 0\}$, and the minimization with respect to $Q$ is over all probability distributions on
$\mathcal{X} = \{0, \dots, d-1\}^2$. ◇

An immediate consequence of the theorem is that the quantum capacity [?] of $\mathcal{A}$ is lower
bounded by

$$\max_{\mathsf{N}} [1 - H(P_{\mathcal{A},\mathsf{N}})], \tag{3}$$

where the maximum is over all choices of the basis $\{|0\rangle, \dots, |d-1\rangle\}$ of $\mathsf{H}$ and
the primitive $d$-th root of unity $\omega$. To be precise, the capacity of $\{\mathcal{A}_n\}$ is defined
as the supremum of achievable rates on $\{\mathcal{A}_n\}$, where a rate $R$ is said to be achievable if
there exists a sequence of codes $\{(\mathcal{C}_n, \mathcal{R}_n)\}$ such that
$\liminf_n \log_d \dim \mathcal{C}_n / n \ge R$ and
$\lim_n F(\mathcal{C}_n, \mathcal{R}_n \mathcal{A}_n) = 1$ (see footnote 1). To see the bound, observe that
$E(R, P)$ is positive for $R < 1 - H(P)$ due to the basic inequality $D(Q\|P) \ge 0$, where equality
occurs if and only if $Q = P$ [15]. The bound $1 - H(P_{\mathcal{A}})$ appeared earlier in Preskill
[?, Section 7.16.2] in the case where $d = 2$ and $(a_{uv})$ is diagonal. The restriction of $(a_{uv})$
being diagonal also exists in this author's previous result [12]. Namely, it treated channels of the form
$\mathcal{A} \sim \{\sqrt{P(u)}\, N_u\}_{u \in \mathcal{X}}$ with some probability distribution $P$ on
$\mathcal{X}$, which are sometimes called Pauli channels especially for $d = 2$.
Another direct consequence of the theorem is

$$\liminf_{n \to \infty} -\frac{1}{n} \log_d \left[ 1 - F^*_{n,Rn}(\mathcal{A}^{\otimes n}) \right] \ge \max_{\mathsf{N}} E(R, P_{\mathcal{A},\mathsf{N}}), \tag{4}$$

where the range of the maximization is the same as that for (3) above. This bound resembles the
random coding exponent $E_r(R, W)$ of a classical channel $W$. As mentioned in [12], the function
$E(R, P)$ is, in fact, the 'slided' random coding exponent $E_r(R + 1, W)$ of some simple classical
channel $W$, i.e., the additive channel defined by $W(y|x) = P(y - x)$,
$x, y \in \mathcal{X} = (\mathbb{Z}/d\mathbb{Z})^2$, which becomes the quaternary (completely) symmetric
channel [?] in the case where $d = 2$ and $\mathcal{A}$ is the depolarizing channel. In [?], one can find
another form of $E$, which is the translation of an older form of the classical random coding exponent
$E_r$ known in the literature (see, e.g., [?, pp. 168, 192-193] and [?]), and suitable for computing
$E(R, P_{\mathcal{A},\mathsf{N}})$ numerically (Fig. 1; see also Fig. 1 of [?]).
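For readers who want to evaluate the exponent numerically, here is a minimal sketch. It assumes, rather than quotes from the references, the Gallager-type expression $E(R, P) = \max_{0 \le \rho \le 1} \{\rho(1 - R) - (1 + \rho) \log_d \sum_x P(x)^{1/(1+\rho)}\}$, which follows from the minimization form of Theorem 1 by writing $|t|^+ = \max_{0 \le \rho \le 1} \rho t$ and exchanging the minimum and maximum. The grid search over $\rho$ and all function names are choices made for this example only.

```python
import math

def exponent(R, P, d=2, steps=2000):
    """Grid-search evaluation of
    max over rho in [0, 1] of rho*(1 - R) - (1 + rho) * log_d sum_x P(x)^(1/(1+rho))."""
    best = 0.0  # the objective equals 0 at rho = 0
    for t in range(steps + 1):
        rho = t / steps
        s = sum(p ** (1.0 / (1.0 + rho)) for p in P if p > 0)
        best = max(best, rho * (1.0 - R) - (1.0 + rho) * math.log(s, d))
    return best

def entropy(P, d=2):
    """Shannon entropy with logarithms to base d."""
    return -sum(p * math.log(p, d) for p in P if p > 0)

# Depolarizing-type distribution on four points with an arbitrary sample value of p.
p = 0.05
P_dep = [1 - p, p / 3, p / 3, p / 3]
threshold = 1 - entropy(P_dep)  # E(R, P) is positive exactly for R below this value
```

Because the objective is concave in $\rho$, a finer grid or a one-dimensional concave maximizer could replace the loop if more precision is needed.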

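It should be remarked that, for the obvious reason, the bounds in (3) and (4) actually can be replaced
by

$$\max_{\mathcal{U}, \mathsf{N}} [1 - H(P_{\mathcal{U}\mathcal{A},\mathsf{N}})] \quad \text{and} \quad \max_{\mathcal{U}, \mathsf{N}} E(R, P_{\mathcal{U}\mathcal{A},\mathsf{N}}),$$

where $\mathcal{U}\mathcal{A}$ denotes the composition of $\mathcal{A}$ and $\mathcal{U}$, the map
$\mathcal{U}$ ranges over all TPCP ones on $\mathsf{L}(\mathsf{H})$, and the range of $\mathsf{N}$ is the
same as above. The role of $\mathcal{U}$ is preprocessing before the recovery operation $\mathcal{R}$,
so that restricting the range of $\mathcal{U}$ to the set of easily implementable ones, say, to that of
all unitary maps of the form $\mathcal{U}(\rho) = U \rho U^\dagger$ with some unitary operator $U$ on
$\mathsf{H}$, may be reasonable.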

1 In the literature, $\liminf_n \log_d \dim \mathcal{C}_n / n \ge R$ is sometimes replaced by
$\limsup_n \log_d \dim \mathcal{C}_n / n \ge R$ (e.g., [?]). Note also that in the definition of the
quantum capacity (for transmission of subspaces) by Barnum et al. [?], a slightly more general setting is
assumed, i.e., two Hilbert spaces $\mathsf{H}_s$ and $\mathsf{H}_c$ are used instead of $\mathsf{H}$, but
our bound is also valid in their setting because we can put
$\mathsf{H}_s = \mathsf{H}_c = \mathsf{H}$. Apart from this difference, there is a seemingly different
definition of the quantum capacity using entanglement fidelity, but actually they are the same [?].
Fig. 1. The function $E(R, P) = E(R, p)$ in the case where $d = 2$ and $P((0,0)) = 1 - p$,
$P(u) = p/3$ for $u \ne (0,0)$, $u \in \mathcal{X} = \{0, 1\}^2$, which applies to the depolarizing channel.



In the case of the depolarizing channel, the relationship between this paper's bound (or that of
[12]) and the previously known bounds is best understood with Fig. 1, which depicts
$E(R, P_{\mathcal{A},\mathsf{N}}) = E(R, P) = E(R, p)$ in the case where $d = 2$ and $P((0,0)) = 1 - p$,
$P(u) = p/3$ for $u \ne (0,0)$, $u \in \mathcal{X} = \{0, 1\}^2$ with $p = 1.5 \times 10^{-3} j$,
$j = 0, 1, \dots$. This applies to the depolarizing channel
$\mathcal{A} \sim \{\sqrt{1-p}\, I, \sqrt{p/3}\, X, \sqrt{p/3}\, XZ, \sqrt{p/3}\, Z\}$. For this channel,
the known bound $1 - H_1(p)$ [?, Fig. 8], [?], where

$$H_1(p) = -p \log_2 p - (1-p) \log_2 (1-p) + p \log_2 3,$$

appears in Fig. 1 as the curve on which the surface $E(R, p)$ meets the horizontal $pR$-plane. The
Shor-Smolin code [10], [?] has improved this lower bound slightly for a limited range of $p$ around the
point $(p^*, 0, 0)$, where the lower bound $1 - H_1(p)$ vanishes [$1 - H_1(p^*) = 0$, $p^* \approx 0.1893$].
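The value of $p^*$ quoted above can be reproduced with a short calculation. The sketch below evaluates $H_1(p)$ as defined in the text and locates the zero of $1 - H_1(p)$ by bisection on $[0, 0.5]$, where $H_1$ is increasing; the function names and the bracketing interval are choices made for this illustration.

```python
import math

def h1(p):
    """H_1(p) = -p log2 p - (1-p) log2(1-p) + p log2 3, with 0 log 0 = 0."""
    total = 0.0
    if p > 0:
        total -= p * math.log2(p)
    if p < 1:
        total -= (1 - p) * math.log2(1 - p)
    return total + p * math.log2(3)

def zero_rate_point(lo=0.0, hi=0.5, iterations=60):
    """Bisection for the root p* of 1 - H_1(p) on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if 1 - h1(mid) > 0:
            lo = mid  # bound still positive: the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

p_star = zero_rate_point()
```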

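Maximization of the bound $E(R, P_{\mathcal{U}\mathcal{A},\mathsf{N}})$ or
$1 - H(P_{\mathcal{U}\mathcal{A},\mathsf{N}})$ with respect to the basis $\mathsf{N}$ and the TPCP
map $\mathcal{U}$ seems troublesome and is largely left untouched except for the following simple case.

Proposition 1: Let a channel $\mathcal{A} \sim \{A_x\}_{x \in \mathcal{X}}$ be given by
$A_x = \sqrt{Q(x)}\, N'_x$, $x \in \mathcal{X}$, where $N'_{(i,j)} = X'^i Z'^j$, and $X'$ and $Z'$ are
defined by

$$X'|b_j\rangle = |b_{(j-1) \bmod d}\rangle, \qquad Z'|b_j\rangle = \omega'^j |b_j\rangle$$

similarly to (2), with $\{|b_j\rangle\}$ and $\omega'$ being an ONB of $\mathsf{H}$ and a primitive
$d$-th root of unity, respectively, and $Q$ is a probability distribution on $\mathcal{X}$. Then, the
maximum of $1 - H(P_{\mathcal{U}\mathcal{A},\mathsf{N}})$ with respect to $\mathsf{N}$ and $\mathcal{U}$,
i.e., with respect to $\{|0\rangle, \dots, |d-1\rangle\}$, $\omega$ and $\mathcal{U}$, where
$\mathcal{U} : \mathsf{L}(\mathsf{H}) \to \mathsf{L}(\mathsf{H})$ ranges over all unitary maps, is
achieved by $|j\rangle = |b_j\rangle$, $j = 0, \dots, d-1$, $\omega = \omega'$, and
$\mathcal{U} = \mathcal{I}$, where $\mathcal{I}$ denotes the identity map on $\mathsf{L}(\mathsf{H})$. ◇

A proof is given in Appendix A.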
Next, we consider general channels. In a setting where elaborated coding schemes that rely on
purification protocols are allowed, the lower bound $1 - H_1(p')$, as well as the Shor-Smolin improvement
on this, for a general channel $\mathcal{A}$ with $d = 2$ was known before this work [?], [42],
[10, the last paragraph], where

$$p' = 1 - \max_{\eta} \langle\eta| [\mathcal{I} \otimes \mathcal{A}](|\Phi^+\rangle\langle\Phi^+|) |\eta\rangle, \tag{5}$$

$|\Phi^+\rangle = 2^{-1/2}(|00\rangle + |11\rangle)$, and the maximum is over all completely entangled
states $\eta$. We compare our bound with the bound $1 - H_1(p')$, which is 'almost' the best among those
previously known in the sense that the known improvement outperforms this only if
$1 - 0.8115 = 0.1885 < p' < 1 - 0.8094 = 0.1906$ and the difference between $1 - H_1(p')$ and the improved
one is at most $10^{-2}$ [?, Fig. 8]. As is proved in Appendix B, for every basis $\mathsf{N}$ defined
with (2) for some $\{|0\rangle, |1\rangle\}$, where $d = 2$, there exists some unitary map $\mathcal{U}$
satisfying

$$1 - H(P_{\mathcal{U}\mathcal{A},\mathsf{N}}) \ge 1 - H_1(p'). \tag{6}$$

Roughly speaking, the gain of this paper's bound comes from the fact that the bound has the form
$1 - H(P_{\mathcal{U}\mathcal{A},\mathsf{N}}) = 1 - H((1 - p', p_1, p_2, p_3)) = 1 - h(p') - p' H((p_1/p', p_2/p', p_3/p'))$,
with $h$ the binary entropy function, and for a fixed $p' = p_1 + p_2 + p_3 > 0$, its minimum is
$1 - H_1(p')$ (reached when $p_1 = p_2 = p_3$); Bennett et al.'s scheme [?] loses information on
$M = [\mathcal{I} \otimes \mathcal{A}](|\Phi^+\rangle\langle\Phi^+|)$ by 'twirling' (a random bilateral
rotation), which increases the entropy of $M$ as high as $H_1(p')$.

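The next example illustrates the advantage of this work.

Example 1. Let us consider the amplitude-damping channel whose Kraus operators are

$$A_{(0,0)} = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\gamma} \end{pmatrix} \quad \text{and} \quad A_{(1,0)} = \begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix}$$

in matrix form with respect to the basis $\{|0\rangle, |1\rangle\}$, where $d = 2$ and
$0 \le \gamma \le 1$. This channel has often been discussed as a reasonable model in the context of
quantum error correction [?, Section 3.4.2], [24, Chapter 8], [26], while to this author's knowledge, it
was not known if any positive rates were achievable by standard quantum error-correcting (stabilizer)
codes on this channel. The $A_{(0,0)}$ and $A_{(1,0)}$ can be expanded, respectively, as

$$A_{(0,0)} = \frac{1 + \sqrt{1-\gamma}}{2}\, I + \frac{1 - \sqrt{1-\gamma}}{2}\, Z \quad \text{and} \quad A_{(1,0)} = \frac{\sqrt{\gamma}}{2}\, (X - XZ).$$

Regarding $A_{(0,1)} = A_{(1,1)}$ as the null operator, we have

$$P_{\mathcal{A}}((0,0)) = (2 - \gamma + 2\sqrt{1-\gamma})/4, \qquad P_{\mathcal{A}}((1,0)) = \gamma/4,$$
$$P_{\mathcal{A}}((0,1)) = (2 - \gamma - 2\sqrt{1-\gamma})/4, \qquad P_{\mathcal{A}}((1,1)) = \gamma/4.$$

Hence, our lower bound to the quantum capacity of this channel is

$$1 - H(P_{\mathcal{A}}) = 1 - h\!\left(\frac{\gamma}{2}\right) - \frac{\gamma}{2} - \left(1 - \frac{\gamma}{2}\right) h\!\left(\frac{1}{2} + \frac{\sqrt{1-\gamma}}{2-\gamma}\right), \tag{7}$$

where $h$ denotes the binary entropy function. This bound actually achieves the maximum of
$1 - H(P_{\mathcal{A},\mathsf{N}'})$ with respect to $\mathsf{N}'$, as can be checked by a direct
calculation and the concavity of entropy.

This bound, together with the previously known one $1 - H_1(p')$ with (5), is plotted in Fig. 2, where
$p'$ is calculated in Appendix B, Example 4.

□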
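The comparison in Example 1 can be checked numerically. The following sketch evaluates the bound $1 - H(P_{\mathcal{A}})$ from the four probabilities stated in the example and compares it with $1 - H_1(p')$, taking $p' = 1 - (2 - \gamma + 2\sqrt{1-\gamma})/4$ from the caption of Fig. 2; the function names and the sample values of $\gamma$ are assumptions made for this illustration.

```python
import math

def entropy2(dist):
    """Shannon entropy in bits, with 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def amplitude_damping_distribution(gamma):
    """P_A on (0,0), (0,1), (1,0), (1,1) as given in Example 1."""
    s = math.sqrt(1 - gamma)
    return [(2 - gamma + 2 * s) / 4, (2 - gamma - 2 * s) / 4, gamma / 4, gamma / 4]

def new_bound(gamma):
    """This paper's bound 1 - H(P_A)."""
    return 1 - entropy2(amplitude_damping_distribution(gamma))

def known_bound(gamma):
    """Previously known bound 1 - H_1(p') with p' = 1 - P_A((0,0))."""
    p = 1 - (2 - gamma + 2 * math.sqrt(1 - gamma)) / 4
    return 1 - entropy2([1 - p, p]) - p * math.log2(3)

for gamma in (0.05, 0.1, 0.2, 0.3):
    print(f"gamma={gamma:.2f}  new={new_bound(gamma):.4f}  known={known_bound(gamma):.4f}")
```

The new bound is never smaller than the known one, because the entropy of a four-point distribution with mass $1 - p'$ on one point is at most $H_1(p')$.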
Fig. 2. This paper's bound $f = 1 - H(P_{\mathcal{A}})$ in (7), drawn as solid line, and the previously
known one $g = 1 - H_1(p')$, dotted line, with $p' = 1 - (2 - \gamma + 2\sqrt{1-\gamma})/4$ for the
amplitude-damping channel in Example 1. Shor and Smolin [10], [?] succeeded in improving $1 - H_1(p')$
by an amount less than $10^{-2}$ for some values of $p'$ with $1 - H_1(p') < 10^{-2}$.



III. Minimum Average Fidelity

The minimum fidelity given in (1) is the simplest criterion for design of quantum error correction
schemes. A known substitute for the minimum fidelity is the entanglement fidelity [?]. It turns
out that yet another criterion is useful to establish Theorem 1: We seek codes of large minimum
average fidelity. The minimum average fidelity
$F_a(\mathcal{C}) = F_a(\mathcal{C}, \mathcal{R}\mathcal{A}_n)$ of a code $(\mathcal{C}, \mathcal{R})$
used on a channel $\mathcal{A}_n : \mathsf{L}(\mathsf{H}^{\otimes n}) \to \mathsf{L}(\mathsf{H}^{\otimes n})$
is defined by

$$F_a(\mathcal{C}) = \min_{\mathsf{B}} \frac{1}{K} \sum_{\psi \in \mathsf{B}} F(\psi, \mathcal{R}\mathcal{A}_n) \tag{8}$$

where $F(\psi, \mathcal{R}\mathcal{A}_n) = \langle\psi| \mathcal{R}\mathcal{A}_n(|\psi\rangle\langle\psi|) |\psi\rangle$,
$K$ is the dimension of $\mathcal{C}$, and the minimization with respect to $\mathsf{B}$ is taken over
all ONBs of $\mathcal{C}$. Note that the minimum exists since the minimization can be written
as that of a continuous function defined on a compact set. According to Schumacher [?], any average
fidelity, and hence the minimum average fidelity, is not less than the entanglement fidelity.

Employing the minimum average fidelity may need an account. In the previous work [12], Theorem 1
was proved for memoryless channels of the form
$\mathcal{A} \sim \{\sqrt{P(u)}\, N_u\}_{u \in \mathcal{X}}$. In this case, $F(\mathcal{C})$ is trivially
lower bounded by the sum of probabilities of errors that are correctable by $\mathcal{C}$. The major
difficulty in analysis on general channels lies in the fact that this bound is no longer true in general.
However, as we will see in the sequel, a similar bound holds for a properly chosen symplectic quantum
code if we replace $F$ by the minimum average fidelity $F_a$. Furthermore, an estimate for
$F_a(\mathcal{C})$ automatically gives one for $F(\mathcal{C})$ by the following lemma.

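Lemma 1: Let the minimum average fidelity
$F_a(\mathcal{C}) = F_a(\mathcal{C}, \mathcal{R}\mathcal{A}_n)$ of a code $(\mathcal{C}, \mathcal{R})$
used on a channel $\mathcal{A}_n : \mathsf{L}(\mathsf{H}^{\otimes n}) \to \mathsf{L}(\mathsf{H}^{\otimes n})$
satisfy

$$1 - F_a(\mathcal{C}) \le G$$

for some constant $G$, and assume $\mathcal{C}$ has dimension $K \ge 2$. Then, there exists a
$\lfloor K/2 \rfloor$-dimensional subspace $\mathcal{D}$ of $\mathcal{C}$ whose minimum fidelity
$F(\mathcal{D}) = F(\mathcal{D}, \mathcal{R}\mathcal{A}_n)$ fulfills

$$1 - F(\mathcal{D}) \le 2G.$$

◇

Proof. Let a normalized vector $\psi_1$ minimize
$F(\psi) = \langle\psi| \mathcal{R}\mathcal{A}_n(|\psi\rangle\langle\psi|) |\psi\rangle$ among those in
$\mathcal{C}$ ($= \mathcal{C}_0$), and let $\mathcal{C}_1$ be the orthogonal complement of
$\mathrm{span}\{\psi_1\}$ in $\mathcal{C}$, which means
$\mathcal{C} = \mathcal{C}_1 \oplus \mathrm{span}\{\psi_1\}$. Next, let $\psi_2$ minimize $F(\psi)$ among
those in $\mathcal{C}_1$, and let $\mathcal{C}_2$ be the orthogonal complement of
$\mathrm{span}\{\psi_1, \psi_2\}$ in $\mathcal{C}$, which means
$\mathcal{C} = \mathcal{C}_2 \oplus \mathrm{span}\{\psi_1, \psi_2\}$. Continue in the same way until we
obtain $\psi_{\lceil K/2 \rceil}$ and $\mathcal{C}_{\lceil K/2 \rceil}$. Put
$\mathcal{D} = \mathcal{C}_{\lceil K/2 \rceil}$. We annex an arbitrarily chosen ONB
$\{\psi_{\lceil K/2 \rceil + 1}, \dots, \psi_K\}$ of $\mathcal{D}$ to
$\{\psi_1, \dots, \psi_{\lceil K/2 \rceil}\}$ to form an ONB of $\mathcal{C}$. Now put
$e(\psi) = 1 - F(\psi)$. Then, by construction,

$$1 - F(\mathcal{D}) \le e(\psi_{\lceil K/2 \rceil}) \le \frac{e(\psi_1) + \cdots + e(\psi_{\lceil K/2 \rceil})}{\lceil K/2 \rceil} \le 2\, \frac{e(\psi_1) + \cdots + e(\psi_K)}{K} \le 2G,$$

as promised. □

This lemma and its proof are analogous to those known in the classical information theory [?, p. 140].
A similar idea was used by Barnum et al. [?], where they adopted entanglement fidelity in
place of minimum average fidelity. This lemma means that a properly chosen subcode $\mathcal{D}$ of
$\mathcal{C}$ works without any loss of asymptotic performance.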

IV. Codes based on Symplectic Geometry 

To prove the theorem, we use symplectic quantum codes, so we recall basic facts on them
in this section. We can regard the index of $N_{(i,j)} = X^i Z^j$, $(i,j) \in \mathcal{X}$, as a pair of
elements from the field $\mathbb{F} = \mathbb{F}_d = \mathbb{Z}/d\mathbb{Z}$, the finite field consisting
of $d$ elements. From these, we obtain a basis
$\mathsf{N}_n = \{N_x \mid x \in (\mathbb{F}^2)^n\}$ of $\mathsf{L}(\mathsf{H}^{\otimes n})$, where
$N_x = N_{x_1} \otimes \cdots \otimes N_{x_n}$ for $x = (x_1, \dots, x_n) \in (\mathbb{F}^2)^n$. We write
$\mathsf{N}_J$ for $\{N_x \in \mathsf{N}_n \mid x \in J\}$ where $J \subseteq (\mathbb{F}^2)^n$. The index
of a basis element

$$((u_1, v_1), \dots, (u_n, v_n)) \in (\mathbb{F}^2)^n$$

can be regarded as the plain $2n$-dimensional vector

$$x = (u_1, v_1, \dots, u_n, v_n) \in \mathbb{F}^{2n}.$$

We can equip the vector space $\mathbb{F}^{2n}$ over $\mathbb{F}$ with a symplectic bilinear form
(symplectic pairing, or inner product), which is defined by

$$(x, y)_{\rm sp} = \sum_{i=1}^{n} u_i v'_i - v_i u'_i \tag{9}$$

for the above $x$ and $y = (u'_1, v'_1, \dots, u'_n, v'_n) \in \mathbb{F}^{2n}$ [44]. Given a subspace
$L \subseteq \mathbb{F}^{2n}$, let

$$L^\perp = \{x \in \mathbb{F}^{2n} \mid \forall y \in L,\ (x, y)_{\rm sp} = 0\}.$$
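The symplectic form (9) and the set $L^\perp$ are easy to compute for small cases. The sketch below is a brute-force illustration, assuming vectors are stored as tuples $(u_1, v_1, \dots, u_n, v_n)$ of integers modulo a prime $d$; the function names are hypothetical, and the sample subspace $L = \{0000, 0101\}$ is the one used later in Example 3.

```python
from itertools import product

def symplectic_form(x, y, d):
    """(x, y)_sp = sum_i (u_i v'_i - v_i u'_i) mod d for x = (u_1, v_1, ..., u_n, v_n)."""
    total = 0
    for i in range(0, len(x), 2):
        total += x[i] * y[i + 1] - x[i + 1] * y[i]
    return total % d

def symplectic_complement(L, n, d):
    """All x in F^{2n} with (x, y)_sp = 0 for every y in L (brute force, small n only)."""
    return {x for x in product(range(d), repeat=2 * n)
            if all(symplectic_form(x, y, d) == 0 for y in L)}

# Sample subspace over F_2 with n = 2.
L = {(0, 0, 0, 0), (0, 1, 0, 1)}
L_perp = symplectic_complement(L, n=2, d=2)
is_self_orthogonal = L <= L_perp  # the condition that L is contained in its complement
```

Enumerating all $d^{2n}$ vectors is only practical for very small $n$; for larger cases one would solve the linear system over $\mathbb{F}$ instead.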



Lemma 2: [?] Let a subspace $L \subseteq \mathbb{F}^{2n}$ satisfy

$$L \subseteq L^\perp \quad \text{and} \quad \dim L = n - m. \tag{10}$$

Choose a set $J \subseteq \mathbb{F}^{2n}$ such that

$$\{y - x \mid x \in J,\ y \in J\} \subseteq (L^\perp \setminus L)^{\rm c}, \tag{11}$$

where the superscript c denotes complement. Then, there exist $d^{n-m}$ subspaces of the form

$$\{\psi \in \mathsf{H}^{\otimes n} \mid \forall M \in \mathsf{N}_L,\ M\psi = \tau(M)\psi\} \tag{12}$$

each of which has dimension $d^m$, where $\tau(M)$ are scalars, and hence eigenvalues of
$M \in \mathsf{N}_L$. The direct sum of these subspaces is the whole space $\mathsf{H}^{\otimes n}$ and
each subspace together with a suitable recovery operator serves as an $\mathsf{N}_J$-correcting quantum
code. ◇

Remarks. A precise definition of $\mathsf{N}_J$-correcting codes can be found in Section III of [26] and
the above lemma has been verified with Theorem III.2 therein. Most constructions of quantum
error-correcting codes rely on this lemma, which is valid even if $d$ is a prime other than two [?],
[39]; related topics have been discussed in [45], [46], [47]. In this paper, we call the quantum codes
in Lemma 2 symplectic quantum codes or symplectic codes, while Rains [?] indicates $L$ by the latter
term. Symplectic codes are often called additive codes [35], [36] or stabilizer codes [?], and the
set $\mathsf{N}_L$ in the lemma is called a stabilizer in the literature. □

The next lemma, which immediately follows from Lemma 2, will be used in the proof of Theorem 1
below.

Lemma 3: [35], [36] As in Lemma 2, assume a subspace $L \subseteq \mathbb{F}^{2n}$ satisfies (10). In
addition, let $J_0 \subseteq \mathbb{F}^{2n}$ be a set satisfying

$$\forall x, y \in J_0, \quad [\, y - x \in L^\perp \Rightarrow x = y \,]. \tag{13}$$

Then, the condition (11) is fulfilled, so that the $d^{n-m}$ codes of the form (12) are
$d^m$-dimensional $\mathsf{N}_{J_0}$-correcting codes. ◇

We assume the next in what follows.

Assumption. When we speak of an $\mathsf{N}_J$-correcting symplectic code $\mathcal{C}$, the recovery
operator $\mathcal{R}$ for the code is always the one presented by Knill and Laflamme
[26, proof of Theorem III.2]. ◇

Note that $\mathcal{R}$ is determined from $\mathcal{C}$ and $J$ in general. In the present case where
$\mathcal{C}$ is a symplectic quantum code in Lemma 3 (or Lemma 7 below), the recovery operator
$\mathcal{R}$ can be written explicitly, viz.,
$\mathcal{R} \sim \{\Pi_{\rm rest}\} \cup \{N_r^\dagger \Pi_r\}_{r \in J_0}$, where $\Pi_r$ is the
projection onto $N_r \mathcal{C} = \{N_r \psi \mid \psi \in \mathcal{C}\}$, and $\Pi_{\rm rest}$ is the
projection onto the orthogonal complement of $\bigoplus_{r \in J_0} N_r \mathcal{C}$ in
$\mathsf{H}^{\otimes n}$. The premise (13) of Lemma 3 can be restated as that $J_0$ is a set of
representatives of cosets of $L^\perp$ in $\mathbb{F}^{2n}$. When the code is used on a channel
$\mathcal{A}_n \sim \{\sqrt{P_n(x)}\, N_x\}$, a natural choice for $J_0$ would be a set consisting of
representatives each of which maximizes the probability $P_n(x)$ in the coset [36] since it is analogous
to maximum likelihood decoding, which is an optimum strategy for classical coding (see Slepian [48] or
any textbook of information theory). In the proof below, we choose another set of representatives, the
classical counterpart of which (minimum entropy decoding) asymptotically yields the same performance as
maximum likelihood decoding [15], [29].



V. Bound on Minimum Average Fidelity 

A. Plan of Proof

Our strategy for proving Theorem 1 is to employ the random coding technique known in classical
information theory [13], [?]. A typical random coding argument goes as follows. Suppose
$F'(\mathcal{C})$ is a measure of performance, which is the minimum average fidelity in our case, of a
code $\mathcal{C}$ and we want to prove the existence of a code $\mathcal{C}$ with
$F'(\mathcal{C}) \ge G$. We take some ensemble $\mathsf{S}$ of codes, and evaluate the ensemble average
$|\mathsf{S}|^{-1} \sum_{\mathcal{C} \in \mathsf{S}} F'(\mathcal{C})$. If the average is lower bounded by
$G$, then we can conclude at least one code $\mathcal{C}$ in $\mathsf{S}$ has performance not smaller
than $G$. In what follows, we will use this proof method twice, that is, first, with $L$ fixed and
$\mathsf{S}$ being the set, say $\mathsf{S}(L)$, of $d^{n-m}$ subspaces in Lemma 2 or 3, and second, with
$\mathsf{S}$ consisting of all $L$ satisfying (10).
B. Preskill's Lower Bound on Fidelity

Preskill showed an interesting lower bound on the minimum fidelity of a code used on quantum
channels, which will be presented in a slightly different form here.

Lemma 4: [?] For a channel
$\mathcal{A} : \mathsf{L}(\mathsf{H}^{\otimes n}) \to \mathsf{L}(\mathsf{H}^{\otimes n})$, an
$\mathsf{N}_J$-correcting code $(\mathcal{C} \subseteq \mathsf{H}^{\otimes n}, \mathcal{R})$ and any
state $\psi \in \mathcal{C}$, the fidelity
$F(\psi) = \langle\psi| \mathcal{R}\mathcal{A}(|\psi\rangle\langle\psi|) |\psi\rangle$ is bounded by

$$1 - F(\psi) \le \sum_{x \in \mathcal{X}^n} \langle\psi| B_x^\dagger B_x |\psi\rangle,$$

where $B_x = \sum_{y \in J^{\rm c}} a_{xy} N_y$, $x \in \mathcal{X}^n$. ◇

This is Preskill's lower bound [?, Section 7.4.1, Eq. (7.58)], and the above form can be obtained by
rewriting the channel, which was described in terms of unitary evolution of a state of an enlarged system
and a partial trace operation, into an operator-sum representation. In Appendix C, an alternative proof
which uses only operator-sum representations is presented.

C. Minimum Average Fidelity Bound for Symplectic Codes 

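To evaluate the minimum average fidelity of codes, we first associate a sequence of probability
distributions $\{P_{\mathcal{A}_n}\}$ with the channel $\{\mathcal{A}_n\}$ on which codes are to be
evaluated.

Definition 2: For each $n$, let $\mathcal{A}_n \sim \{A_x^{(n)}\}_{x \in \mathcal{X}^n}$, expand
$A_x^{(n)}$ as $A_x^{(n)} = \sum_{y \in \mathcal{X}^n} a_{xy} N_y$, $x \in \mathcal{X}^n$, and
define a probability distribution $P_{\mathcal{A}_n}$ on $\mathcal{X}^n$ by

$$P_{\mathcal{A}_n}(y) = \sum_{x \in \mathcal{X}^n} |a_{xy}|^2, \qquad y \in \mathcal{X}^n.$$

◇

That $\sum_{x \in \mathcal{X}^n} P_{\mathcal{A}_n}(x) = 1$ readily follows, again, from the
trace-preserving condition $\sum_{x \in \mathcal{X}^n} A_x^{(n)\dagger} A_x^{(n)} = I$ and the property
of the basis $\mathsf{N}_n$ that $N_x^\dagger N_y = I$ if and only if $x = y$ [37].

Example 2. Let $\{\mathcal{A}_n\}$ be a memoryless channel $\mathcal{A}_n = \mathcal{A}^{\otimes n}$,
$n = 1, 2, \dots$. It is easy to see that

$$P_{\mathcal{A}_n}(y_1, \dots, y_n) = \prod_{i=1}^{n} P_{\mathcal{A}}(y_i) \tag{14}$$

where $P_{\mathcal{A}} = P_{\mathcal{A},\mathsf{N}}$ has already appeared in Definition 1. □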

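The next is a result of the first application of the random coding technique in this paper.

Lemma 5: As in Lemma 2, let a subspace $L \subseteq \mathbb{F}^{2n}$ satisfy (10) and (11) with some
$J \subseteq \mathbb{F}^{2n}$, and let
$\mathcal{A}_n : \mathsf{L}(\mathsf{H}^{\otimes n}) \to \mathsf{L}(\mathsf{H}^{\otimes n})$ be a channel
(TPCP linear map). With $L$, $J$ and $\mathcal{A}_n$ fixed, let $\mathcal{C}(L)$ achieve the maximum of
$F_a(\mathcal{C}) = F_a(\mathcal{C}, \mathcal{R}\mathcal{A}_n)$ in $\mathsf{S}(L)$ (see Section V-A),
i.e., the maximum among the $d^{n-m}$ symplectic codes associated with $L$ as in Lemma 2 or 3. Then,

$$1 - F_a(\mathcal{C}(L)) \le \sum_{x \in J^{\rm c}} P_{\mathcal{A}_n}(x).$$

◇

Proof. Taking the averages over an ONB $\mathsf{B}$ of a code $\mathcal{C}$ of both sides of the
inequality in Lemma 4, we have

$$1 - \frac{1}{d^m} \sum_{\psi \in \mathsf{B}} F(\psi) \le \frac{1}{d^m} \sum_{\psi \in \mathsf{B}} \sum_{x} \langle\psi| B_x^\dagger B_x |\psi\rangle.$$

This holds for all ONBs $\mathsf{B}$ of $\mathcal{C}$ including the worst one $\mathsf{B}^*(\mathcal{C})$,
which is a minimizer for (8), so that

$$1 - F_a(\mathcal{C}) \le \frac{1}{d^m} \sum_{\psi \in \mathsf{B}^*(\mathcal{C})} \sum_{x} \langle\psi| B_x^\dagger B_x |\psi\rangle.$$

With $L$ fixed, we have $d^{n-m}$ choices for $\mathcal{C}$. Taking the averages of both sides of the above
inequality over these choices, we obtain

$$\begin{aligned}
\frac{1}{d^{n-m}} \sum_{\mathcal{C}} [1 - F_a(\mathcal{C})]
&\le \frac{1}{d^{n-m}} \sum_{\mathcal{C}} \frac{1}{d^m} \sum_{\psi \in \mathsf{B}^*(\mathcal{C})} \sum_{x} \langle\psi| B_x^\dagger B_x |\psi\rangle \\
&= \frac{1}{d^n} \sum_{x} \sum_{\mathcal{C}} \sum_{\psi \in \mathsf{B}^*(\mathcal{C})} \langle\psi| B_x^\dagger B_x |\psi\rangle \\
&= \frac{1}{d^n} \sum_{x} \mathrm{Tr}\, B_x^\dagger B_x \\
&= \frac{1}{d^n} \sum_{x} \mathrm{Tr} \sum_{y, z \in J^{\rm c}} a_{xy}^* N_y^\dagger a_{xz} N_z \\
&= \sum_{x} \sum_{y \in J^{\rm c}} |a_{xy}|^2 \\
&= \sum_{y \in J^{\rm c}} P_{\mathcal{A}_n}(y),
\end{aligned}$$

where we have used the fact that the $d^{n-m}$ subspaces $\mathcal{C}$ sum to $\mathsf{H}^{\otimes n}$
orthogonally for the second equality, and the property of error basis $\mathsf{N}_n$ that
$\mathrm{Tr}\, N_y^\dagger N_z = d^n \delta_{yz}$ for the fourth equality [37]. Hence, at least
one code $(\mathcal{C}, \mathcal{R})$ has the promised minimum average fidelity. □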
Example 3. To illustrate the difference between the minimum average fidelity $F_a$ and minimum fidelity $F$ as well as the significance of Lemma ^, let us consider again the amplitude-damping channel discussed in Example 1 and evaluate some small codes on this channel. Let $n = 2$ and $m = 1$. In this example, we denote a vector $(u_1, v_1, u_2, v_2) \in \mathbb{F}^4$ simply by $u_1v_1u_2v_2$. Let $L = \{0000, 0101\}$. Then, $N_L = \{I\otimes I, Z\otimes Z\}$, and we have two symplectic codes $C_0 = \mathrm{span}\{|00\rangle, |11\rangle\}$ and $C_1 = \mathrm{span}\{|01\rangle, |10\rangle\}$, where $|00\rangle = |0\rangle\otimes|0\rangle$ and so on. It is easy to check that the cosets of $L^\perp$ in $\mathbb{F}^4$ are

$$L^\perp = \{0000, 0101, 1010, 1111, 0001, 0100, 1011, 1110\}$$

and

$$h_1 + L^\perp = \{1000, 1101, 0010, 0111, 1001, 1100, 0011, 0110\},$$

where $h_1 = 1000$. Let $\Pi_0$ and $\Pi_1$ denote the projections onto $C_0$ and $C_1$, respectively. Putting $J_0 = \{0000, h_1\}$ and $J = J_0 + L = \{0000, 0101, 1000, 1101\}$, we see that both $(C_0, \mathcal{R}_0)$ and $(C_1, \mathcal{R}_1)$, where $\mathcal{R}_0 \sim \{\Pi_0, N_{h_1}^\dagger\Pi_1\}$ and $\mathcal{R}_1 \sim \{\Pi_1, N_{h_1}^\dagger\Pi_0\}$, are $N_{J_0}$-correcting as well as $N_J$-correcting from Lemmas [| and 7 or directly from Lemma ||] (recall also that the general form of $\mathcal{R}$ for a symplectic code was given in the last paragraph of Section IV). If we prepare an input state $|\psi\rangle = x|00\rangle + y|11\rangle \in C_0$, then the fidelity $F(\psi) = \langle\psi|\mathcal{R}\mathcal{A}^{\otimes 2}(|\psi\rangle\langle\psi|)|\psi\rangle$ can be calculated as $1 - \gamma yy^*$. This implies the minimum fidelity is $F(C_0) = 1 - \gamma$ while the minimum average fidelity is $F_a(C_0) = 1 - \gamma/2$. In a similar way, evaluating $F(C_1)$ results in $F(C_1) = 1 - \gamma$ and $F_a(C_1) = 1 - \gamma/2$. On the other hand, the bound in Lemma || states $F_a(C(L)) \ge 1 - \sum_{z\notin J} P_{\mathcal{A}^{\otimes 2}}(z) = 1 - \sum_{z\notin J} P_{\mathcal{A}}^2(z) = 1 - 3\gamma/4$, where $P_{\mathcal{A}}^2$ is the product measure obtained from $P_{\mathcal{A}}$ as in (14). This is an example for which the inequality in Lemma [| is true but that with $F_a$ replaced by $F$ fails. □
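The fidelity formula in Example 3 can be checked numerically. The sketch below assumes the amplitude-damping channel of Example 1 has the operator-sum elements $A_0 = \mathrm{diag}(1, \sqrt{1-\gamma})$ and $A_1 = \sqrt{\gamma}\,|0\rangle\langle 1|$, and that $N_{h_1} = X\otimes I$ for $h_1 = 1000$; the function names are hypothetical and the basis order is $|00\rangle, |01\rangle, |10\rangle, |11\rangle$.

```python
from math import sqrt, isclose

def kron(A, B):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def overlap(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def code_fidelity(x, y, gamma):
    """Fidelity of x|00> + y|11> after two uses of the channel and recovery R_0."""
    A0 = [[1, 0], [0, sqrt(1 - gamma)]]   # assumed Kraus operators of Example 1
    A1 = [[0, sqrt(gamma)], [0, 0]]
    X = [[0, 1], [1, 0]]
    I2 = [[1, 0], [0, 1]]
    psi = [x, 0, 0, y]
    proj_c0 = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
    proj_c1 = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]
    x_on_first = kron(X, I2)              # N_{h_1} = X (x) I, which is self-adjoint
    fidelity = 0.0
    for Ai in (A0, A1):
        for Aj in (A0, A1):
            out = apply(kron(Ai, Aj), psi)
            kept = apply(proj_c0, out)                        # recovery element Pi_0
            fixed = apply(x_on_first, apply(proj_c1, out))    # element N_{h_1}^dagger Pi_1
            fidelity += abs(overlap(psi, kept)) ** 2 + abs(overlap(psi, fixed)) ** 2
    return fidelity

gamma = 0.3
print(code_fidelity(0.6, 0.8j, gamma), 1 - gamma * 0.64)
```

Under these assumptions the two printed numbers agree, and taking $y = 1$ gives the worst case $1 - \gamma$, while averaging over any orthonormal basis of $C_0$ gives $1 - \gamma/2$.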
VI. Proof of Theorem 1

We put $P = P_{\mathcal{A}}$. Since the bound in the theorem is trivial when $k \ge n-1$, we assume $m = k+1 < n$. What we want is a code $(\mathcal{D}, \mathcal{R})$ with dimension $d^k$ whose minimum fidelity is lower bounded by $1 - 2d^2(n+1)^{2(d^2-1)}d^{-nE(R,P)}$. To show the existence of such a code, it is enough to prove

$$1 - F_a(C(L)) \le d^2(n+1)^{2(d^2-1)}d^{-nE(R,P)} \qquad (15)$$

for some $L$ with $\dim L = n - m = n - (k+1)$ and some choice of $J_0$ in Lemma |||, where $C(L)$ achieves the maximum of $F_a(C) = F_a(C, \mathcal{R}\mathcal{A}^{\otimes n})$ among the $d^{n-m}$ symplectic codes associated with $L$ as in Lemma 5, since we have Lemma []]. Recall that the probability distribution $P_{\mathcal{A}_n}$ for the memoryless channel $\mathcal{A}$ has a product form as in (0), which is denoted by $P^n$ in this proof.

We employ the method of types lfL5|| , [p8fl , [p9| , p0| , on which a few basic facts to be used are collected here. For $x = (x_1, \ldots, x_n) \in X^n$, define a probability distribution on $X$ by

$$P_x(u) = \frac{|\{i \mid 1 \le i \le n,\ x_i = u\}|}{n}, \quad u \in X,$$

which is called the type (empirical distribution) of $x$. With $X$ fixed, the set of all possible types of sequences from $X^n$ is denoted by $Q_n(X)$ or simply by $Q_n$. For a type $Q \in Q_n$, $T_Q$ is defined as $\{x \in X^n \mid P_x = Q\}$. In what follows, we use

$$|Q_n| \le (n+1)^{|X|-1}, \qquad (16)$$

where $|X| = d^2$ in the present case, and

$$\forall Q \in Q_n, \quad |T_Q| \le d^{nH(Q)}. \qquad (17)$$

Note that if $x \in X^n$ has type $Q$, then $P^n(x) = \prod_{a\in X} P(a)^{nQ(a)} = \exp_d\{-n[H(Q) + D(Q\|P)]\}$.
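The basic facts (16) and (17) and the product formula for $P^n(x)$ can be checked by brute force on a small instance. The following is a minimal sketch, assuming $d = 2$, $n = 4$ and an arbitrary illustrative distribution $P$ on $X = \mathbb{F}^2$; all helper names are hypothetical, and logarithms are taken to base $d$ to match $\exp_d$.

```python
from collections import Counter
from itertools import product
from math import log, prod, isclose

d, n = 2, 4
alphabet = list(product(range(d), repeat=2))       # X = F^2, so |X| = d^2
P = dict(zip(alphabet, [0.7, 0.1, 0.1, 0.1]))      # illustrative distribution only

def type_of(x):
    """Empirical distribution (type) of the sequence x."""
    counts = Counter(x)
    return {a: counts[a] / len(x) for a in alphabet}

def entropy(Q):
    return -sum(q * log(q, d) for q in Q.values() if q > 0)

def divergence(Q):
    return sum(q * log(q / P[a], d) for a, q in Q.items() if q > 0)

# Product formula: P^n(x) = exp_d{-n[H(Q) + D(Q||P)]} when x has type Q.
x = (alphabet[0], alphabet[0], alphabet[1], alphabet[3])
Q = type_of(x)
lhs = prod(P[a] for a in x)
rhs = d ** (-n * (entropy(Q) + divergence(Q)))

# Group all sequences of X^n by type to obtain |Q_n| and every |T_Q|.
class_sizes = Counter(tuple(type_of(s).values()) for s in product(alphabet, repeat=n))
num_types = len(class_sizes)
bound_16 = (n + 1) ** (len(alphabet) - 1)
bound_17_holds = all(
    size <= d ** (n * entropy(dict(zip(alphabet, q)))) + 1e-9
    for q, size in class_sizes.items()
)
print(lhs, rhs, num_types, bound_16, bound_17_holds)
```

For this instance there are 35 types, well below the bound $(n+1)^{|X|-1} = 125$ in (16).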
We apply Lemma |3| choosing $J_0$ as follows. Since $\dim L = n - m$, we have $\dim L^\perp = n + m$ |43 |50|. From each of the $d^{n-m}$ cosets of $L^\perp$ in $\mathbb{F}^{2n}$, select a vector that minimizes $H(P_x)$, i.e., a vector $x$ satisfying $H(P_x) \le H(P_y)$ for any $y$ in the coset. Let $J_0(L)$ denote the set of the $d^{n-m}$ selected vectors. This selection uses the idea of the minimum entropy decoder known in the classical information theory literature [5S|. Let
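The selection of $J_0(L)$ can be sketched as follows. This is an illustration only, assuming the symplectic form $\sum_i (u_i v_i' - v_i u_i')$ on vectors written as $(u_1, v_1, \ldots, u_n, v_n)$ and breaking entropy ties by lexicographic order (the text leaves the tie-breaking rule open); the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log

def symplectic(x, y, d):
    """Symplectic form on vectors (u1, v1, ..., un, vn) over F_d (assumed convention)."""
    return sum(x[2 * i] * y[2 * i + 1] - x[2 * i + 1] * y[2 * i]
               for i in range(len(x) // 2)) % d

def type_entropy(x, d):
    """Entropy (base d) of the type of x, viewed as n symbols (u_i, v_i) in F_d^2."""
    n = len(x) // 2
    counts = Counter((x[2 * i], x[2 * i + 1]) for i in range(n))
    return -sum(c / n * log(c / n, d) for c in counts.values())

def min_entropy_representatives(L, n, d):
    """Pick one minimum-entropy vector from each coset of L^perp in F_d^{2n}."""
    space = list(product(range(d), repeat=2 * n))
    L_perp = [x for x in space if all(symplectic(w, x, d) == 0 for w in L)]
    seen, reps = set(), []
    for x in space:
        if x in seen:
            continue
        coset = [tuple((a + b) % d for a, b in zip(x, w)) for w in L_perp]
        seen.update(coset)
        # Ties in entropy are broken lexicographically (an arbitrary choice).
        reps.append(min(coset, key=lambda y: (type_entropy(y, d), y)))
    return reps

# The subspace L = {0000, 0101} of Example 3 (n = 2, d = 2).
L = [(0, 0, 0, 0), (0, 1, 0, 1)]
J0 = min_entropy_representatives(L, n=2, d=2)
print(J0)
```

For the subspace of Example 3, every vector in the coset $h_1 + L^\perp$ has the same type entropy, so any of them is a valid choice; the example's $J_0 = \{0000, 1000\}$ and the lexicographic choice made here are both minimizers.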

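$$\mathsf{A} = \{L \subseteq \mathbb{F}^{2n} \mid L \text{ linear},\ L \subseteq L^\perp,\ \dim L = n - m\}$$

and for each $L \in \mathsf{A}$, let $C(L)$ be the best $N_{J_0(L)}$-correcting code in $S(L)$. Putting

$$\bar{F} = \frac{1}{|\mathsf{A}|}\sum_{L\in\mathsf{A}} F_a(C(L)),$$

we will show that $1 - \bar{F}$ is bounded from above by $d^2(n+1)^{2(d^2-1)}d^{-nE(R,P)}$, which will ensure (0) for some $L$ and hence, establish the theorem by the argument at the beginning of this proof. This is our second application of the random coding method.

The $\{0,1\}$-valued indicator function $1[T]$ equals 1 if the statement $T$ is true and equals 0 otherwise. From Lemma we have

$$
\begin{aligned}
1 - \bar{F} &\le \frac{1}{|\mathsf{A}|}\sum_{L\in\mathsf{A}}\sum_{x\notin J_0(L)} P^n(x) \\
&= \frac{1}{|\mathsf{A}|}\sum_{L\in\mathsf{A}}\sum_{x\in\mathbb{F}^{2n}} P^n(x)\,1[x\notin J_0(L)] \\
&= \sum_{x\in\mathbb{F}^{2n}} P^n(x)\,\frac{|\mathsf{B}(x)|}{|\mathsf{A}|}, \qquad (18)
\end{aligned}
$$

where we have put

$$\mathsf{B}(x) = \{L \in \mathsf{A} \mid x \notin J_0(L)\}, \quad x \in \mathbb{F}^{2n}.$$

The fraction $|\mathsf{B}(x)|/|\mathsf{A}|$ is trivially bounded as

$$\frac{|\mathsf{B}(x)|}{|\mathsf{A}|} \le 1, \quad x \in \mathbb{F}^{2n}. \qquad (19)$$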

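We use the next lemma, a proof of which is given in Appendix D.

Lemma 6: Let

$$\mathsf{A}(x) = \{L \in \mathsf{A} \mid x \in L^\perp \setminus \{0\}\}.$$

Then, $|\mathsf{A}(0)| = 0$ and

$$\frac{|\mathsf{A}(x)|}{|\mathsf{A}|} = \frac{d^{n+m} - 1}{d^{2n} - 1} \le d^{-(n-m)}, \quad x \in \mathbb{F}^{2n},\ x \ne 0. \qquad (20)$$

◇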



Remarks. Note that $\mathsf{A}$ is not empty since any $(n-m)$-dimensional subspace of

$$\{(x_1, 0, x_3, 0, \ldots, x_{2n-1}, 0) \in \mathbb{F}^{2n} \mid x_1, x_3, \ldots, x_{2n-1} \in \mathbb{F}\}$$

is contained in $\mathsf{A}$. This lemma is essentially due to Calderbank et al. [25], who have used it with $\mathsf{A}(x)$ replaced by $\{L \in \mathsf{A}' \mid x \in L^\perp \setminus L\}$ for some $\mathsf{A}' \subseteq \mathsf{A}$ to prove the Gilbert-Varshamov-type bound for quantum codes. Matsumoto and Uyematsu [27] proved Lemma |^ with $\mathsf{A}(x)$ replaced by $\{L \in \mathsf{A} \mid x \in L^\perp \setminus L\}$ using the Witt lemma explicitly [43], [44]. The present definition of $\mathsf{A}(x)$ makes the argument easier. □
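The counting identity in (20) can be confirmed by exhaustive enumeration for very small parameters. The sketch below assumes the symplectic form $\sum_i (u_i v_i' - v_i u_i')$ on $(u_1, v_1, \ldots, u_n, v_n)$ and simply lists all self-orthogonal subspaces; it is meant as a sanity check, not as an efficient algorithm, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def symplectic(x, y, d):
    """Symplectic form on vectors (u1, v1, ..., un, vn) over F_d (assumed convention)."""
    return sum(x[2 * i] * y[2 * i + 1] - x[2 * i + 1] * y[2 * i]
               for i in range(len(x) // 2)) % d

def span(gens, n, d):
    """All F_d-linear combinations of the generators, as a frozenset of vectors."""
    return frozenset(
        tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % d for i in range(2 * n))
        for coeffs in product(range(d), repeat=len(gens))
    )

def self_orthogonal_subspaces(n, m, d):
    """The family A: (n-m)-dimensional subspaces L of F_d^{2n} with L contained in L^perp."""
    space = list(product(range(d), repeat=2 * n))
    k = n - m
    family = set()
    for gens in combinations(space, k):
        S = span(gens, n, d)
        if len(S) == d ** k and all(symplectic(a, b, d) == 0 for a in S for b in S):
            family.add(S)
    return family

def lemma6_ratio(n, m, d, x):
    """|A(x)| / |A| for a nonzero vector x, computed by brute force."""
    family = self_orthogonal_subspaces(n, m, d)
    hits = [L for L in family if all(symplectic(w, x, d) == 0 for w in L)]
    return Fraction(len(hits), len(family))

def lemma6_formula(n, m, d):
    return Fraction(d ** (n + m) - 1, d ** (2 * n) - 1)

print(lemma6_ratio(2, 1, 2, (1, 0, 0, 0)), lemma6_formula(2, 1, 2))
```

For $d = 2$, $n = 2$, $m = 1$ both values are $7/15$, which is indeed at most $d^{-(n-m)} = 1/2$.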

Since $\mathsf{B}(x) \subseteq \{L \in \mathsf{A} \mid \exists y \in \mathbb{F}^{2n},\ H(P_y) \le H(P_x),\ y - x \in L^\perp \setminus \{0\}\}$ from the design of $J_0(L)$ specified above (cf. [49]),

$$|\mathsf{B}(x)| \le \sum_{y\in\mathbb{F}^{2n}:\ H(P_y)\le H(P_x),\ y\ne x} |\mathsf{A}(y-x)| \le \sum_{y\in\mathbb{F}^{2n}:\ H(P_y)\le H(P_x),\ y\ne x} |\mathsf{A}|\,d^{-(n-m)}, \qquad (21)$$

where we have used (20) for the latter inequality. Combining (18), (19) and (21), we can proceed as follows with the aid of the basic inequalities in (16) and (17) as well as the inequality $\min\{a+b, 1\} \le \min\{a, 1\} + \min\{b, 1\}$ for $a, b \ge 0$:

$$
\begin{aligned}
1 - \bar{F} &\le \sum_{x\in\mathbb{F}^{2n}} P^n(x)\,\min\Bigl\{\sum_{y\in\mathbb{F}^{2n}:\ H(P_y)\le H(P_x),\ y\ne x} d^{-(n-m)},\ 1\Bigr\} \\
&\le \sum_{Q\in Q_n}\sum_{x\in T_Q} P^n(x)\,\min\Bigl\{\sum_{Q'\in Q_n:\ H(Q')\le H(Q)} |T_{Q'}|\,d^{-(n-m)},\ 1\Bigr\} \\
&\le d^2\sum_{Q\in Q_n}\exp_d[-nD(Q\|P)]\sum_{Q'\in Q_n:\ H(Q')\le H(Q)}\exp_d\bigl[-n|1 - R - H(Q')|^+\bigr] \\
&\le d^2\sum_{Q\in Q_n}\exp_d[-nD(Q\|P)]\,|Q_n|\max_{Q'\in Q_n:\ H(Q')\le H(Q)}\exp_d\bigl[-n|1 - R - H(Q')|^+\bigr] \\
&= d^2\sum_{Q\in Q_n}|Q_n|\exp_d\bigl[-nD(Q\|P) - n|1 - R - H(Q)|^+\bigr] \\
&\le d^2|Q_n|^2\exp_d\bigl\{\max_{Q\in Q_n}\bigl[-nD(Q\|P) - n|1 - R - H(Q)|^+\bigr]\bigr\} \\
&\le d^2(n+1)^{2(d^2-1)}\exp_d[-nE(R,P)].
\end{aligned}
$$

This implies at least one $L$ satisfies (15), and the proof is complete owing to Lemma 0. □
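To get a feeling for the exponent appearing in the last two lines of the chain, one can minimize $D(Q\|P) + |1 - R - H(Q)|^+$ over the types $Q \in Q_n$ directly. The sketch below assumes $d = 2$ and, purely for illustration, the distribution $P = (1-p, p/3, p/3, p/3)$ of the depolarizing channel with $p = 0.1$; the function names are hypothetical, and the minimum over types only approximates $E(R,P)$ from above.

```python
from math import log

def count_vectors(n, k):
    """All k-tuples of nonnegative integers summing to n (one per type in Q_n)."""
    if k == 1:
        yield (n,)
        return
    for c in range(n + 1):
        for rest in count_vectors(n - c, k - 1):
            yield (c,) + rest

def exponent_over_types(R, P, n, d=2):
    """min over types Q of D(Q||P) + |1 - R - H(Q)|^+, with logarithms to base d."""
    best = float("inf")
    for counts in count_vectors(n, len(P)):
        Q = [c / n for c in counts]
        H = -sum(q * log(q, d) for q in Q if q > 0)
        D = sum(q * log(q / p, d) for q, p in zip(Q, P) if q > 0)
        best = min(best, D + max(1 - R - H, 0.0))
    return best

p = 0.1
P = [1 - p, p / 3, p / 3, p / 3]          # illustrative choice, all entries positive
H_P = -sum(q * log(q, 2) for q in P)
low_rate = exponent_over_types(0.0, P, n=40)
high_rate = exponent_over_types(0.5, P, n=40)
print(1 - H_P, low_rate, high_rate)
```

With these numbers $1 - H(P) \approx 0.37$; the computed value stays clearly positive at $R = 0$ and is close to zero at $R = 0.5$, in line with the statement that the exponent is positive below the threshold rate.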

VII. CONCLUDING REMARKS 

This paper provided evidence, from an information theoretic viewpoint, that standard quantum error
correction schemes work reliably in the presence of quantum noise, the effects of which are modeled as
general completely positive linear maps. What is technically new is evaluating the minimum average
fidelity over all eigenspaces of a stabilizer $N_L$, which yields a good estimate for the minimum fidelity
of codes. The fact thus obtained (Lemma 5) allowed us to derive the main result in a manner familiar
in information theory. Likewise, based on Lemma 5 and with another classical technique, a high-rate
improvement on the exponent $E(R, P)$, which corresponds to the expurgated bound in classical channel coding,
has already been made in |yj after the online distribution of the present work, though it is effective
only for channels of low noise level and does not improve the capacity bound.

Although this paper's lower bound on the capacity is the best among those known except for a
few cases, it is important to recognize that this paper's lower bound is not tight in general. In
this sense, Shor and Smolin [|Hj have gone further. Specifically, Shor and Smolin exploited the
'degeneracy' of error-correcting codes to present a lower bound on the capacity of the depolarizing
channel $\mathcal{A} \sim \{\sqrt{1-p}\,I, \sqrt{p/3}\,X, \sqrt{p/3}\,XZ, \sqrt{p/3}\,Z\}$ such that their bound is positive while the bound
$1 - H(P_{\mathcal{A}}) = 1 - h(p) - p\log_2 3$ becomes negative for restricted values of $p$, where $h$ is the binary
entropy function. The degeneracy concept is somewhat misleading because a single quantum code
can be regarded as both degenerate and nondegenerate, as is clearly understood from the next lemma,
which is a refinement of Lemma

Lemma 7: As in Lemma [| assume a subspace $L \subseteq \mathbb{F}^{2n}$ and $J_0$ satisfy (|TCD and (|IBD, respectively.
Put

$$J = \{z + w \mid z \in J_0,\ w \in L\}.$$

Then, the condition ( |TT]) is fulfilled, so that the $d^{n-m}$ codes of the form flT^D are $d^m$-dimensional
$N_J$-correcting codes. ◇

If an $N_{J'}$-correcting code is given and $\{M|\psi\rangle \mid M \in N_{J'}\}$ is not linearly independent for a state $|\psi\rangle$
in the code space, then the code is called degenerate JïïJ. The codes in Lemma 7 are nondegenerate
$N_{J_0}$-correcting codes while they are degenerate $N_J$-correcting codes. In this paper, we have evaluated
nondegenerate $N_{J_0}$-correcting codes with $|J_0| = d^{n-m}$, but actually $|J| = d^{2(n-m)}$ in this case. Hence,
the codes can correct more errors than those evaluated in this paper. Suggestions for developing Shor
and Smolin's result can be found in the final section of [|J.

Shor and Smolin's result does not deny the possibility of the tightness of this paper's bound for all
channels. Extending this work's result to the case of channels with memory of a Markovian nature
is possible if second-order (or higher-order) types are used instead of the usual types [52]. It may
also be interesting to ask whether the present approach will help us obtain bounds or improve the known
ones for Gaussian quantum channels already discussed in the literature [53], [54], [55].
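As a small numerical companion to the remark on the depolarizing channel, the value of $p$ at which the bound $1 - h(p) - p\log_2 3$ changes sign can be located by bisection. This is only a sketch; the function names and the bracketing interval are choices made here, not part of the original discussion.

```python
from math import log2

def hashing_bound(p):
    """The bound 1 - h(p) - p*log2(3) for the depolarizing channel, 0 < p < 1."""
    h = -p * log2(p) - (1 - p) * log2(1 - p)
    return 1 - h - p * log2(3)

def zero_crossing(lo=0.01, hi=0.5, tol=1e-10):
    """Bisection for the sign change of hashing_bound on [lo, hi]."""
    assert hashing_bound(lo) > 0 > hashing_bound(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hashing_bound(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p_star = zero_crossing()
print(p_star)
```

The bound is positive below roughly $p \approx 0.189$ and negative above it on this interval, which is the range where a degenerate-code argument can still give a positive rate.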
Appendices

A. Proof of Proposition [Z|

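In this proof, we assume $d = 2$ for notational simplicity. The proof readily extends to the case where $d > 2$. First, we show that the maximum of $1 - H(P_{\mathcal{U}\mathcal{A},N})$ with the restriction $\mathcal{U} = \mathcal{I}$ is achieved by the indicated $N$. For $\mathsf{M} : \mathsf{L}(\mathsf{H}^{\otimes 2}) \to \mathsf{L}(\mathsf{H}^{\otimes 2})$ and $4\times 4$ matrices $M$ over $\mathbb{C}$, we write $\mathsf{M} \sim M$ if $M$ is the matrix of $\mathsf{M}$ with respect to the basis $\{|b_0b_0\rangle, |b_0b_1\rangle, |b_1b_0\rangle, |b_1b_1\rangle\}$, where $|b_0b_0\rangle = |b_0\rangle\otimes|b_0\rangle$ and so on. We use the next lemma due to Choi.

Lemma 8: [2^] A linear map $\mathcal{A} : \mathsf{L}(\mathsf{H}) \to \mathsf{L}(\mathsf{H})$ is completely positive if and only if $[\mathcal{I}\otimes\mathcal{A}](|\Phi^+\rangle\langle\Phi^+|)$ is positive, where $\mathcal{I}$ is the identity map on $\mathsf{L}(\mathsf{H})$, and

$$|\Phi^+\rangle = \frac{1}{\sqrt{2}}\bigl(|b_0b_0\rangle + |b_1b_1\rangle\bigr).$$

Moreover, if we represent $[\mathcal{I}\otimes\mathcal{A}](|\Phi^+\rangle\langle\Phi^+|)$ as

$$[\mathcal{I}\otimes\mathcal{A}](|\Phi^+\rangle\langle\Phi^+|) \sim \frac{1}{2}\sum_{x\in X} a_x^\dagger a_x \qquad (22)$$

and rearrange the elements of $a_x = (a_{x,00}, a_{x,01}, a_{x,10}, a_{x,11}) \in \mathbb{C}^4$ into the matrix form

$$A_x = \begin{pmatrix} a_{x,00} & a_{x,01} \\ a_{x,10} & a_{x,11} \end{pmatrix}, \quad x \in X,$$

we obtain an operator-sum representation of $\mathcal{A}$: $\mathcal{A} \sim \{\mathsf{A}_x\}$, where $\mathsf{A}_x : \mathsf{L}(\mathsf{H}) \to \mathsf{L}(\mathsf{H})$ is the Hermitian adjoint operator of $\sum_{i,j} a_{x,ij}|b_i\rangle\langle b_j|$, i.e., the adjoint of the operator whose matrix is $A_x$, $x \in X$. ◇

Remark. The correspondence $\xi : \mathbb{C}^4 \to \mathsf{L}(\mathsf{H})$ that has sent $a_x$ to $\mathsf{A}_x$ is explicitly written as

$$\xi : (a_{00}, a_{01}, a_{10}, a_{11}) \mapsto \Bigl(\sum_{i,j} a_{ij}|b_i\rangle\langle b_j|\Bigr)^\dagger.$$

□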

If we define an inner product $(\cdot,\cdot)$ on $\mathsf{L}(\mathsf{H})$ by $(N, M) = 2^{-1}\,\mathrm{Tr}\, N^\dagger M$ (half the Hilbert-Schmidt inner product), then $\{N_x\}_{x\in X}$ is an orthonormal basis with respect to this inner product, and hence $P = P_{\mathcal{A}}$ in Theorem 1 is rewritten as

$$P(y) = \sum_{x\in X} |(N_y, \mathsf{A}_x)|^2.$$
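The formula $P(y) = \sum_x |(N_y, \mathsf{A}_x)|^2$ is easy to evaluate for a concrete qubit channel. The sketch below assumes $d = 2$, the error basis $N_{(u,v)} = X^uZ^v$, and the amplitude-damping operators $A_0 = \mathrm{diag}(1, \sqrt{1-\gamma})$, $A_1 = \sqrt{\gamma}\,|0\rangle\langle 1|$; these choices and all function names are assumptions made for illustration. It also evaluates the quantity $1 - \sum_{z\in J} P^2(z)$ for the set $J$ of Example 3.

```python
from math import sqrt, isclose

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inner(N, M):
    """Half the Hilbert-Schmidt inner product, (N, M) = Tr(N^dagger M) / 2."""
    prod = matmul(dagger(N), M)
    return 0.5 * (prod[0][0] + prod[1][1])

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
# Assumed labelling of the error basis: N_(u,v) = X^u Z^v.
ERROR_BASIS = {(0, 0): I2, (1, 0): X, (0, 1): Z, (1, 1): matmul(X, Z)}

def pauli_distribution(kraus):
    """P(y) = sum_x |(N_y, A_x)|^2 for a list of 2x2 operator-sum elements."""
    return {y: sum(abs(inner(N, A)) ** 2 for A in kraus)
            for y, N in ERROR_BASIS.items()}

gamma = 0.3
A0 = [[1, 0], [0, sqrt(1 - gamma)]]
A1 = [[0, sqrt(gamma)], [0, 0]]
P = pauli_distribution([A0, A1])

# J = {0000, 0101, 1000, 1101} from Example 3, written as pairs ((u1,v1),(u2,v2)).
J = [((0, 0), (0, 0)), ((0, 1), (0, 1)), ((1, 0), (0, 0)), ((1, 1), (0, 1))]
mass_outside_J = 1 - sum(P[a] * P[b] for a, b in J)
print(P, mass_outside_J)
```

Under these assumptions $P$ sums to one and the mass outside $J$ equals $3\gamma/4$, matching the value quoted in Example 3.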
In fact, one sees that $P(y)$ has a physical meaning as follows. If we define an inner product between $n = (n_{00}, n_{01}, n_{10}, n_{11})$ and $m = (m_{00}, m_{01}, m_{10}, m_{11})$ by $(n, m) = 2^{-1}\sum_{z\in X} n_z m_z^*$, then $(\xi(n), \xi(m)) = (n, m)$, so that we have

$$P(y) = \sum_{x\in X} |(n_y, a_x)|^2,$$

where $\xi(n_y) = N_y$. Now, imagine we perform the orthogonal measurement $\{2^{-1}n_y^\dagger n_y\}_{y\in X}$ on the system in the state (22). Then, we obtain the result $y$ with probability

$$
\begin{aligned}
\mathrm{Tr}\Bigl[\frac{1}{2}n_y^\dagger n_y\cdot\frac{1}{2}\sum_{x\in X} a_x^\dagger a_x\Bigr]
&= \frac{1}{4}\sum_{x\in X} |n_y a_x^\dagger|^2 \\
&= \sum_{x\in X} |(n_y, a_x)|^2 \\
&= P(y).
\end{aligned}
$$

Then, from the property of von Neumann entropy [57], $H(P)$ is not smaller than the von Neumann entropy of the state (22) and equals it when $n_x$ is proportional to $a_x$ for each $x \in X$, which is fulfilled by setting $|0\rangle = |b_0\rangle$ and $|1\rangle = |b_1\rangle$ (and u = uj for $d > 2$). To complete the proof, we have only to notice that any unitary map preserves the entropy of the state that it acts on, which implies $H(P)$ does not decrease by preprocessing of applying $\mathcal{I}\otimes\mathcal{U}$ to $[\mathcal{I}\otimes\mathcal{A}](|\Phi^+\rangle\langle\Phi^+|)$. □

B. Comparison of Bounds

In this appendix, we prove @, which states that our bound $1 - H(P_{\mathcal{A}})$ is not smaller than the previously known one $1 - H_1(p')$, and then calculate $1 - H_1(p')$ for the amplitude-damping channel as an example. Putting $|b_0\rangle = |0\rangle$ and $|b_1\rangle = |1\rangle$ (and hence viewing state vectors in terms of the basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$), we shall use the argument in the previous appendix, which applies to general CP maps $\mathcal{A}$ except the last paragraph.

First, we prove (^|). As argued by Bennett et al. [§, p. 3830], every maximally entangled state can be represented, up to an overall phase factor, as the transpose of $(u+iv, w+iz, -w+iz, u-iv)$ with $u, v, w, z$ real, i.e., as $(x, y, -y^*, x^*)^{\mathrm{T}}$, where $xx^* + yy^* = 1/2$. Suppose $|\eta\rangle = x|00\rangle + y|01\rangle - y^*|10\rangle + x^*|11\rangle$ achieves the maximum in (||). Then, putting $u = \sqrt{2}(x^*, y^*, -y, x)$, this maximum can be written as

$$
\begin{aligned}
\frac{1}{2}\,u\,\Bigl[\frac{1}{2}\sum_{s\in X} a_s^\dagger a_s\Bigr]u^\dagger
&= \frac{1}{4}\sum_{s\in X} |u a_s^\dagger|^2 \\
&= \sum_{s\in X} |(u, a_s)|^2 \\
&= \sum_{s\in X} |(U, \mathsf{A}_s)|^2 \\
&= \sum_{s\in X} \Bigl|\frac{1}{2}\mathrm{Tr}\, U^\dagger \mathsf{A}_s\Bigr|^2 \\
&= \sum_{s\in X} |(I, U^\dagger \mathsf{A}_s)|^2 \\
&= P_{\mathcal{U}\mathcal{A}}((0,0)),
\end{aligned}
$$

where $U = \xi(u)$, and $\mathcal{U}(\rho) = U^\dagger\rho U$ (note that $U$ is unitary). Hence, $1 - H(P_{\mathcal{U}\mathcal{A}}) \ge 1 - H_1(1 - P_{\mathcal{U}\mathcal{A}}((0,0)))$, and the inequality is strict unless $P_{\mathcal{U}\mathcal{A}}((1,0)) = P_{\mathcal{U}\mathcal{A}}((0,1)) = P_{\mathcal{U}\mathcal{A}}((1,1))$ by the property of the Shannon entropy $H$. □
Example 4. We have calculated $1 - H(P_{\mathcal{A}})$ for the amplitude-damping channel in Example 1. For comparison, we compute $1 - H_1(p')$ with (5) for this channel. For the operator-sum representation in Example 1, we have $a_{(0,0)} = (1, 0, 0, \sqrt{1-\gamma})$ and $a_{(1,0)} = (0, 0, \sqrt{\gamma}, 0)$. Hence, the maximized quantity in (5) can be calculated as

$$\frac{1}{2}\,u\,\Bigl[\frac{1}{2}\sum_{s=(0,0),(1,0)} a_s^\dagger a_s\Bigr]u^\dagger = \gamma/4 + (1 - \gamma + \sqrt{1-\gamma})\,u^2 + (1 - \gamma - \sqrt{1-\gamma})\,v^2,$$

where $u = \mathrm{Re}\,x$ and $v = \mathrm{Im}\,x$. From the normalization constraint $0 \le u^2 + v^2 \le 1/2$, it follows that the maximum is $(2 - \gamma + 2\sqrt{1-\gamma})/4$, and hence, $p' = 1 - (2 - \gamma + 2\sqrt{1-\gamma})/4$. □
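The two capacity bounds compared in this appendix can be evaluated side by side for the amplitude-damping channel. The sketch below assumes that $H_1(p)$ denotes the entropy of the distribution $(1-p, p/3, p/3, p/3)$, as the strictness condition in the proof above suggests, and that $P_{\mathcal{A}} = ((1+\sqrt{1-\gamma})^2/4,\ \gamma/4,\ \gamma/4,\ (1-\sqrt{1-\gamma})^2/4)$, which is what the vectors $a_{(0,0)}$ and $a_{(1,0)}$ of Example 4 give under the error basis $X^uZ^v$; the function names are hypothetical.

```python
from math import sqrt, log2

def shannon_entropy(dist):
    return -sum(p * log2(p) for p in dist if p > 0)

def amplitude_damping_bounds(gamma):
    """Return (1 - H(P_A), 1 - H_1(p')) for the amplitude-damping channel."""
    s = sqrt(1 - gamma)
    # Assumed closed form of P_A in the order I, X, XZ, Z.
    P = [(1 + s) ** 2 / 4, gamma / 4, gamma / 4, (1 - s) ** 2 / 4]
    p_prime = 1 - (2 - gamma + 2 * s) / 4
    # Assumed meaning of H_1: entropy of (1 - p', p'/3, p'/3, p'/3).
    H1 = shannon_entropy([1 - p_prime, p_prime / 3, p_prime / 3, p_prime / 3])
    return 1 - shannon_entropy(P), 1 - H1

for gamma in (0.05, 0.1, 0.2, 0.3):
    ours, previous = amplitude_damping_bounds(gamma)
    print(gamma, round(ours, 4), round(previous, 4))
```

Since both distributions put the same mass $1 - p'$ on the identity and the uniform split of the remaining mass maximizes entropy, the first number in each row is never smaller than the second.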

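C. Proof of Lemma |J

We employ the recovery operator $\mathcal{R} \sim \{O\} \cup \{R_r\}$ constructed in the proof of Theorem III.2 of [26] as well as the notation therein, where in the present case their $\{A_a\}$ are to be read $\{N_x\}$. Since the conditions (19) and (20) in Theorem III.2 of [26] can be restated without referring to the code basis $\{|0_L\rangle, \ldots, |(K-1)_L\rangle\}$ (see, e.g., [37], [58]), we can assume $|\psi\rangle = |0_L\rangle$ without loss of generality. Suppressing the superscript of $A_x$ and using the relations $R_r = V_r\sum_i |\nu_r^i\rangle\langle\nu_r^i|$ and $V_r|\nu_r^i\rangle = |i_L\rangle$ [26], we have

$$
\begin{aligned}
F(\psi) &= \sum_r\sum_x \langle 0_L|R_rA_x|0_L\rangle\langle 0_L|A_x^\dagger R_r^\dagger|0_L\rangle \\
&= \sum_r\sum_x \langle\nu_r^0|A_x|0_L\rangle\langle 0_L|A_x^\dagger|\nu_r^0\rangle \\
&= \sum_x \langle 0_L|A_x^\dagger\Pi_0A_x|0_L\rangle,
\end{aligned}
$$

where we have put $\Pi_i = \sum_r |\nu_r^i\rangle\langle\nu_r^i|$, $0 \le i \le K-1$. Also we put $\Pi_K = O = I - \sum_{0\le i\le K-1}\Pi_i$. Thus,

$$
\begin{aligned}
1 - F(\psi) &= \sum_{1\le i\le K}\sum_x \langle 0_L|A_x^\dagger\Pi_iA_x|0_L\rangle \\
&= \sum_{1\le i\le K}\sum_x\sum_{y,z} a_{xy}^*a_{xz}\langle 0_L|N_y^\dagger\Pi_iN_z|0_L\rangle \\
&= \sum_{1\le i\le K}\sum_x \langle 0_L|S_x^\dagger\Pi_iS_x|0_L\rangle \\
&\le \sum_x \langle 0_L|S_x^\dagger S_x|0_L\rangle,
\end{aligned}
$$

where $S_x = \sum_{y\in J^c} a_{xy}N_y$. □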
D. Proof of Lemma

That $|\mathsf{A}(0)| = 0$ is trivial. The lemma follows if we show that $|\mathsf{A}(x)| = |\mathsf{A}(y)|$ for any two distinct nonzero vectors $x$ and $y$. This is because if it is so, putting $M = |\mathsf{A}(x)|$, $x \ne 0$, and counting the pairs $(x, L)$ such that $x \in L^\perp$, $L \in \mathsf{A}$ and $x \ne 0$ in two ways, we will have $(d^{2n} - 1)M = |\mathsf{A}|(d^{n+m} - 1)$. To prove $|\mathsf{A}(x)| = |\mathsf{A}(y)|$, we use the Witt lemma, which states that for a space $V$ with a nondegenerate (nonsingular) symplectic form and subspaces $U$ and $W$ of $V$, if an isometry (an invertible linear map that preserves the inner product) $\alpha$ from $U$ to $W$ exists, then $\alpha$ can be extended to an isometry from $V$ onto itself [44, p. 81], [43, Theorem 3.9]. First, note that any linear map from the space $\mathrm{span}\{x\}$ to $\mathrm{span}\{y\}$ preserves the symplectic inner product (||), which always equals 0 on these spaces. Among such maps, we choose the isometry $\alpha$ with $y = \alpha(x)$. Then, by the Witt lemma, $\alpha$ can be extended to $\mathbb{F}^{2n}$. Since $L \in \mathsf{A}(x)$ implies $\alpha(L) \in \mathsf{A}(y)$ and $\alpha$ is injective on subspaces, we have $|\mathsf{A}(x)| \le |\mathsf{A}(y)|$; since $L \in \mathsf{A}(y)$ implies $\alpha^{-1}(L) \in \mathsf{A}(x)$, we have $|\mathsf{A}(x)| \ge |\mathsf{A}(y)|$. Hence, $|\mathsf{A}(x)| = |\mathsf{A}(y)|$, establishing the lemma. □